The immune cell infiltration-associated molecular subtypes and gene signature predict prognosis for osteosarcoma patients

Host immune dysregulation involves in the initiation and development of osteosarcoma (OS). However, the exact role of immune cells in OS remains unknown. We aimed to distinguish the molecular subtypes and establish a prognostic model in OS patients based on immunocyte infiltration. The gene expression profile and corresponding clinical feature of OS patients were obtained from TARGET and GSE21257 datasets. MCP-counter and univariate Cox regression analyses were applied to identify immune cell infiltration-related molecular subgroups. Functional enrichment analysis and immunocyte infiltration analysis were performed between two subgroups. Furthermore, Cox regression and LASSO analyses were performed to establish the prognostic model for the prediction of prognosis and metastasis in OS patients. The subgroup with low infiltration of monocytic lineage (ML) was related to bad prognosis in OS patients. 435 DEGs were screened between the two subgroups. Functional enrichment analysis revealed these DEGs were involved in immune- and inflammation-related pathways. Three important genes (including TERT, CCDC26, and IL2RA) were identified to establish the prognostic model. The risk model had good prognostic performance for the prediction of metastasis and overall survival in OS patients. A novel stratification system was established based on ML-related signature. The risk model could predict the metastasis and prognosis in OS patients. Our findings offered a novel sight for the prognosis and development of OS.


The prognostic model was constructed based on monocytic lineage-related genes
First, we carried out the univariate Cox analysis to identify the survival-associated genes based on the above DEGs, and p < 0.05 as the cutoff value.The least absolute shrinkage and selection operator (LASSO) Cox regression analysis was carried out by using the "survival" and "glmnet" packages.The multivariate Cox regression analysis was performed to screen independent risk factors related to prognosis.The risk score for each patients was calculated based on the following formula: risk score = coefficient of Gene A × expression of gene A + coefficient of Gene N × expression of gene N 21 .The performance of the prognostic model was assessed through receiver operating characteristic curve and the Kaplan-Meier survival analysis.Based on the results of COX regression analysis, a nomogram was constructed to predict overall survival in OS patients.Nomogram and calibration curve were generated using the "ggplot2" and "survival R" packages.

GeneMANIA database
GeneMANIA (http:// genem ania.org) is a versatile and intuitive online platform that facilitates the generation of hypotheses pertaining to gene functionality 22 .To evaluate the potential functions of signature genes, GENE-MANIA was utilized to construct a gene interaction network.

Tumor immune single cell hub (TISCH)
The TISCH database (http:// tisch.comp-genom ics.org/ galle ry/) is an extensive compilation of single-cell RNA sequencing data.It offers valuable insights into the diverse nature of the tumor microenvironment.Leveraging this database, we embarked on an exploration of the heterogeneity within the tumor microenvironment across a range of cell types.

Statistical analysis
Univariate and multivariate Cox regression analyses were applied to screen prognostic biomarkers.The comparison between two groups was achieved by Mann-Whitney test.The overall survival between the two groups was analyzed by using Kaplan-Meier analysis.The statistical analysis was performed by using R software (v3.5.3), a p < 0.05 represented statistical difference.

ML was related to the prognosis of OS patients
In our study, the MCP-counter algorithm and univariate and univariate Cox regression analysis were performed to identify the survival-associated immune cells.The results of univariate Cox regression analysis showed that ML was a survival-related immune cell for OS patients (Fig. 1A).In addition, Kaplan-Meier analysis indicated that low level of ML was related to a poorer prognosis for OS patients (p = 0.033, Fig. 1B).

Analysis of DEGs and potential signaling pathways between two subgroups
A total of 435 DEGs were identified following DEGs analysis (Fig. 2A,B).Compared with the LML, 101 genes were downregulated and 334 genes were upregulated in the HML group.In the term of biological processes (BP), these DEGs were involved in regulation of lymphocyte activation, T cell activation, positive regulation of cell activation, regulation of T cell activation, leukocyte cell-cell adhesion, etc.In the term of cellular components (CC), DEGs were involved in external side of plasma membrane, secretory granule membrane, ficolin-1-rich granule, immunological synapse, NADPH oxidase complex, etc.In the term of molecular functions (MF), DEGs were significantly enriched in carbohydrate binding, cytokine receptor binding, cytokine binding, cytokine receptor activity, C-C chemokine receptor activity, etc.In the term of KEGG, DEGs were mainly enriched in hematopoietic cell lineage, osteoclast differentiation, cytokine-cytokine receptor interaction, B cell receptor signaling pathway, etc. (Fig. 3).

Immunocyte infiltration
We analyzed the immune status between the LML and HML subgroups to decipher the immune microenvironment in OS.As shown in Fig. 4A, compared to the HML group, the stromal score, immune score, and ESTIMATE
In the TARGET dataset, the expression level of TERT and CCDC26 was down-regulated in the low risk group, while the expression of IL2RA was up-regulated in the low risk group.Moreover, high risk group had a lower proportion of alive (Fig. 6A).OS patients in the low risk group exhibited longer overall survival than those in the high risk group (Fig. 6B, p < 0.001).The AUC for this prognostic model was 0.8 at 1-year, 0.87 at 3-year, and 0.85 at 5-year (Fig. 6C), this result indicated that the prognostic model had good diagnostic performance for OS patients.We also used the GSE21257 dataset to verify the diagnosis and prognostic features of the risk model.As shown in Fig. 7A, the expression level of TERT and CCDC26 was down-regulated in the low-risk group, while the expression of IL2RA was up-regulated in the low-risk group.Moreover, high risk group had a www.nature.com/scientificreports/lower proportion of alive.OS patients in the low-risk group exhibited longer overall survival than those in the high-risk group (p = 0.025) (Fig. 7B).The AUC for this prognostic model was 0.84 at 1-year, 0.67 at 3-year, and 0.68 at 5-year (Fig. 7C).These results were consistent with the results in the TARGET dataset.
In addition, a nomogram was established to further aid in predicting the prognosis of OS patients (Fig. 8A).The prediction results of the nomogram were highly consistent with the observation of OS patients based on the nomogram calibration curve (Fig. 8B).

Interaction analysis of prognostic genes
By employing the GeneMANIA database, we successfully built a protein interaction network for the signature genes (TERT and IL2RA).Through this analysis, we discovered a total of 20 genes that engage in interactions with the signature genes (Fig. 9A).Functional enrichment analysis was conducted on these 22 genes.The outcomes obtained from the enrichment analysis demonstrated that these genes are predominantly linked to the telomere organization, telomere maintenance, human T cell leukemia virus 1 infection, Th1 and Th2 cell differentiation, etc. (Fig. 9B).

Association between risk model-related genes and tumor microenvironment (TME)
We conducted an analysis on the expression levels of the TERT and IL2RA genes in tumor microenvironmentassociated cells within OS, using the TISCH database.Our findings demonstrated that IL2RA displayed a higher level of infiltration in cDC1, monocyte, and M2 cells (Fig. 10).

The prognostic model has the ability to distinguish metastatic OS patients
As shown Fig. 11A, compared to the high-risk group (TARGET, 58.54% and GSE21257, 17.25%), more no metastasis cases (TARGET, 90.25% and GSE21257, 58.34%) were observed in the low-risk group.Moreover, compared with the metastatic group, the risk score was lower in the non-metastatic group (p < 0.01, Fig. 11B,C).Furthermore, the results of ROC analysis showed that the diagnostic performance of the prognostic model for the prediction of metastasis were 0.659 and 0.705 in TARGET and GSE21257, respectively (Fig. 11D,E).These findings showed that the risk model could predict metastasis in OS patients.

Discussion
OS is a highly malignant cancer, and 80% of OS patients still died from metastasis 5,23,24 .Therefore, its treatment still faces great challenges.Recent studies have demonstrated that several diagnostic, management, and prognostic analyzes associated with immune cell populations provided clinical guidance for improving treatment outcomes in patients with OS [25][26][27] .Furthermore, a bioinformatics-based research method, including sample collection, genomic analysis, and identification of regulatory networks, can promote a better deciphering and understanding of the underlying pathological mechanisms of action of the immune system in OS [28][29][30] .Thus, identifying genes related to immune cell infiltration is important for improving the treatment and diagnosis of OS patients.
In this study, we conducted a comprehensive analysis to investigate the potential impact of ML-related genes as prognostic indicators.We have successfully uncovered novel insights into the survival-related ML-related genes of OS.Throughout this research, we have made several groundbreaking discoveries.Firstly, we have successfully pinpointed 435 DEGs between ML-related subgroups, with a majority of them being involved in immune-and inflammation-related pathways.Secondly, we have successfully devised a ML-related signature and established a scoring system that exhibits a significant correlation with the overall survival of OS patients.This unique www.nature.com/scientificreports/signature has proven to be highly effective in accurately stratifying patients into low-and high-risk groups, while also providing precise predictions for the overall survival of OS patients with sensitivity and specificity.
Our findings in the field of OS are consistent with the existing literature 31 .Previous study has reported similar patterns of gene expression alterations and identified signature genes classifier in OS patients.Our study further supported the notion that the signature genes play crucial roles in OS development and progression.Furthermore, our enrichment analyses results provided additional insights into the potential mechanisms of these signature genes, highlighting their potential as therapeutic targets in OS treatment.In our research, the immunocyte infiltration level in OS patients was assessed through the MCP-counter.Our results indicated that a low level of monocytic lineage (ML) was related to a bad prognosis in OS patients.Monocytes are important immune cells and important regulators of cancer initiation and progression 32 .The www.nature.com/scientificreports/monocyte-directed adjuvant therapies had the potential value in the treatment of cancer 33,34 .A recent study showed that increased peripheral blood monocytes counts were associated a poorer prognosis in pancreatic cancer patients 35 .Immune cell infiltration involves in OS metastasis, and patrolling monocytes suppressed OS Another important finding is that we established a prognostic diagnostic model for OS patients.We identified 435 ML-related genes, and three genes (TERT, CCDC26, and IL2RA) were used to construct the prognostic model.Telomerase (TERT) was a catalytic subunit of telomerase, which involves in tumorigenesis 37 .TERT gene exerted carcinogenic effect, targeting TERT was an effective therapy in the treatment of non-small cell lung  www.nature.com/scientificreports/cancer 38 .TERT could potentially function as a valuable genomic indicator for detecting and forecasting various types of cancer, while also holding promise as a potential target for therapeutic interventions in the case of OS 39 .In addition, TERT was a potential prognostic biomarker in diagnosis and prognosis of cancer, including breast cancer, thyroid cancer, and bladder cancer [40][41][42] .Moreover, the suppression of TERT expression decreased osteosarcoma cell metastasis, motility, and proliferation 39 .At present, the role of Coiled-coil domain containing (CCDC26) in cancer prognosis remains unexplored.CCDC26 knockdown could cause imatinib resistance in gastrointestinal stromal tumor cells through decreasing c-KIT expression 43 .CCDC26 rs4295627 polymorphism was a risk marker for glioma patients 44,45 .Interleukin 2 receptor subunit alpha (IL2RA) was increased after stimulation in immune cells, such as regulatory T cells 46 .Recent studies revealed that IL2RA was closely related to the development and progression of tumorigenesis and the prognosis of cancer patients.IL2RA suppressed differentiation and contributed to stem cell-related properties, implying that it was a potential target in acute myeloid leukemia 47 .In addition, elevated mRNA expression of IL2RA was an adverse prognostic biomarker in acute myeloid leukemia 48 .Higher expression of IL2RA was related to the poorer prognosis and higher immune infiltration level in pancreatic ductal adenocarcinoma 49,50 .IL2RA is a molecule that is relatively recent in its discovery, with only limited studies on its involvement in OS currently available.However, it has been found to play a significant part in the development of tumors.Additional investigations are warranted to thoroughly investigate the extent of its impact on OS.In our study, the prognostic model containing TERT, CCDC26, and IL2RA genes, and it showed good prognostic performance, as well as for the prediction of metastasis in OS patients.Therefore, this prognostic risk model is helpful for the diagnosis and treatment of OS patients.There were several limitations in this study.The study primarily focused on data mining and data analysis, the results was not validated by external experiments.Further research is needed to validate our findings.

Conclusion
In general, two ML-associated molecular subtypes and a ML-related signature (TERT, CCDC26, and IL2RA) were identified for the establishment of the prognostic model.The low infiltration level of ML was related to a bad prognosis and inactivated immune status in OS patients.Moreover, the ML-related prognostic model could predict prognosis and metastasis in OS patients.

Figure 1 .
Figure 1.The ML was related to a bad prognosis in patients with OS. (A) The univariate Cox analysis of six immune cells and two stromal cells based on TARGET database.(B) The Kaplan-Meier analysis showed that the level of ML was significantly associated with prognosis in patients.

Figure 2 .
Figure 2. Identification of DEGs between the LML and HML subgroups.(A) The volcano plot depicted the DEGs between the LML and HML subgroups.The green dots represented down-regulated genes, whereas the red dots represented up-regulated genes.(B) Heatmap plot depicted the top 50 DEGs between the LML and HML subgroups.

Figure 3 .
Figure 3. Enrichment analysis of DEGs.(A) Bubble plots depicted the results of GO and KEGG.The network diagram depicted the immune-and inflammation-related GO-BP terms (B) and KEGG pathways (C).The blue nodes represented GO-BP terms or KEGG pathways, red nodes represented genes involved in the pathways.

Figure 4 .
Figure 4.The landscape of immunocyte infiltration levels in the two subgroups.(A) The comparisons of stromal score, immune score, ESTIMATE score, and tumor purity between the LML and HML subgroups.(B) The comparisons of immunocyte infiltration levels between the LML and HML subgroups.*p < 0.05, **p < 0.01, and ***p < 0.001.

Figure 6 .
Figure 6.Assessment of prognostic model in the TARGET database.(A) The expression level of TERT, CCDC26 and IL2RA (below), survival status (middle), and the distribution of risk scores between low and high-risk groups (upper).(B) Survival analysis of showed the difference between low and high-risk groups.(C) Time-dependent ROC curve analyses of the prognostic model.

Figure 7 .
Figure 7. Validation of prognostic model in the GSE21257 dataset.(A) The expression level of TERT, CCDC26 and IL2RA (below), survival status (middle), and the distribution of risk scores between low and high-risk groups (upper).(B) Survival analysis of showed the difference between low and high-risk groups.(C) Timedependent ROC curve analyses of the prognostic model.

Figure 8 .Figure 9 .
Figure 8. Establishment of the nomogram.(A) Metastasis and risk score were used to establish the nomogram.(B) Calibration curve of the nomogram.

Figure 10 .
Figure10.Genes associated with risk models were expressed in cells that are relevant to the tumor microenvironment.The expression levels of TERT (A) and IL2RA (B) in OS microenvironment-related cells were visualized using a heatmap in the dataset GSE162454.

Figure 11 .
Figure 11.Assessment of the ability of prognostic model to predict metastasis in patients with OS. (A) The comparisons of metastasis and no metastasis cases in the low and high-risk groups.(B) The comparisons of risk score in the metastatic and non-metastatic groups.(C) ROC analysis for the diagnostic performance in the prediction of OS metastasis.